Framework to improve parallel job workflow

ABSTRACT

Embodiments of the inventive subject matter include receiving, in a distributed computing environment, a plurality of files for execution. Embodiments further include identifying, by parsing the plurality of files, code segments contained in each of the plurality of files. Embodiments further include determining, based on a comparison of the code segments and definitions contained in a distributed computing basic function library, a first group of the code segments that include configuration tasks and a second group of the code segments that include computational tasks. Embodiments further include combining the first group of the code segments to form a super configuration task. Embodiments further include creating an executable code, wherein the executable code comprises the super configuration task and the second group of code segments. Embodiments further include allocating the executable code to one or more nodes. Embodiments further include executing the executable code on the one or more nodes.

RELATED APPLICATIONS

This application is a Continuation of and claims the priority benefit of U.S. patent application Ser. No. 14/294,408 filed Jun. 3, 2014, which claims the priority under 35 U.S.C. § 119 benefit of China Patent Application No. 201310267938.5 filed Jun. 28, 2013, which is incorporated by reference in its entirety.

FIELD

Embodiments of the present invention relate to distributed computing, and more specifically, to a method and apparatus for managing multiple jobs in a distributed computing system.

BACKGROUND

With the development of computer hardware and software technology, the emergence of computer clusters provides more efficient data computing power while improving the computing performance of separate computers. Based on distributed computing technology, one or more jobs may be divided into multiple parallel executable tasks, and these tasks may be allocated to one or more processing units (e.g. processor cores) at multiple computing nodes in a distributed computing system for execution. The performance of distributed computing technology depends, to a great extent, on how these tasks are scheduled and managed. Task scheduling and management may be implemented by transmitting various types of control data among respective tasks.

So far, providers of distributed computing technology have developed various basic function libraries capable of supporting distributed computing, in which various basic functions for scheduling and managing parallel tasks are defined. Therefore, independent software vendors (ISVs) in each industry do not have to develop basic functionality supporting distributed computing again. Instead, independent software vendors may develop applications suitable for their industries by invoking functions in basic function libraries. For example, a software vendor in the weather forecasting field may develop applications for weather forecasting based on a basic function library, and a software vendor in the data mining field may develop applications for data analysis based on the basic function library.

Typically, the complexity of existing distributed computing systems requires cooperation between multiple applications to accomplish a computing job. Sub-jobs of a large computing job depend on one another and execute in chronological sequence. These sub-jobs jointly form a workflow, in which multiple applications from one or more independent software vendors might be involved. However, a user does not have the source code of these applications but only executable code; therefore, the user can only execute these applications by invoking executable code, which prevents further optimization of the overall performance of the respective applications.

Usually each application comprises tasks associated with task management and scheduling. For example, task Allocate may allocate various resources to multiple tasks comprised in an application when the application initially starts running, and task Release may release all allocated resources when the application finishes running. Suppose applications App-A and App-B are serially executed; then the following phenomenon might occur: various resources that have been released by release task Release-A of application App-A are allocated to application App-B by allocate task Allocate-B of application App-B. Note that tasks such as resource allocation and release do not directly contribute to the computation of an application but are used for assisting in the execution of the application, so the ratio of the execution time of management and scheduling tasks to that of the entire application becomes an important factor affecting the operation efficiency of the application.

The time for allocating and releasing resources increases as the number of computing nodes in a distributed computing system increases. With the development of distributed computing systems, the number of computing nodes has increased from dozens to hundreds or even more, which causes the operation efficiency of jobs in distributed computing systems to decrease to some extent. Improving the operation efficiency of distributed computing systems has therefore become a hot research issue.

SUMMARY

Therefore, it is desired to develop a technical solution capable of managing multiple jobs in a distributed computing system, and it is desired that the technical solution can manage the multiple jobs and thus improve the operation efficiency of the multiple jobs in the distributed computing system, without retrieving source code of each job. To this end, the various embodiments of the present invention provide a method and apparatus for managing multiple jobs in a distributed computing system.

According to one aspect of the present invention, there is provided a method for managing multiple jobs in a distributed computing system according to one embodiment of the present invention, the method comprising: dividing, in response to having received multiple jobs, multiple tasks comprised in each of the multiple jobs into configuration tasks and computation tasks, wherein each of the multiple jobs is an executable program; combining the configuration tasks associated with the multiple jobs into a super configuration task; merging the multiple jobs into a super job based on the super configuration task and the computation tasks; and executing the super configuration task and the computation tasks comprised in the super job by using multiple computing nodes in a distributed computing environment.

According to one aspect of the present invention, the super configuration task is executed only once.

According to one aspect of the present invention, the executable program is written based on a distributed computing basic function library.

According to another aspect of the present invention, there is provided an apparatus for managing multiple jobs in a distributed computing system, the apparatus comprising: a dividing module configured to divide, in response to having received multiple jobs, multiple tasks comprised in each of the multiple jobs into configuration tasks and computation tasks, wherein each of the multiple jobs is an executable program; a combining module configured to combine the configuration tasks associated with the multiple jobs into a super configuration task; a merging module configured to merge the multiple jobs into a super job based on the super configuration task and the computation tasks; and an executing module configured to execute the super configuration task and the computation tasks comprised in the super job by using multiple computing nodes in a distributed computing environment.

According to one aspect of the present invention, the super configuration task is executed only once.

According to one aspect of the present invention, the executable program is written based on a distributed computing basic function library.

Using the method and apparatus described in the present invention, tasks involved in a respective job can be managed and scheduled without retrieving source code of the jobs; and the operation efficiency of the respective job in the distributed computing system can be improved as much as possible.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

Through the more detailed description of some embodiments of the present disclosure in the accompanying drawings, the above and other objects, features and advantages of the present disclosure will become more apparent, wherein the same reference generally refers to the same components in the embodiments of the present disclosure.

FIG. 1 schematically shows an exemplary computer system/server 12 which is applicable to implement the embodiments of the present invention;

FIG. 2 schematically shows a sequence diagram 200 of a method for executing multiple jobs comprising parallel tasks according to one solution;

FIG. 3 schematically shows a flowchart of a method for managing multiple jobs in a distributed computing system according to one embodiment of the present invention;

FIG. 4 schematically shows a sequence diagram 400 of a method for executing a super task according to one embodiment of the present invention;

FIG. 5 schematically shows a mapping relationship between source code 510 and executable code 520 associated with a job;

FIG. 6 schematically shows a schematic view of principles for executing a job 610 comprising parallel tasks according to one embodiment of the present invention;

FIG. 7 schematically shows a flowchart of a method for executing a super task according to one embodiment of the present invention; and

FIG. 8 schematically shows a block diagram 800 of an apparatus for managing multiple jobs in a distributed computing system according to one embodiment of the present invention.

DETAILED DESCRIPTION

Some preferable embodiments will be described in more detail with reference to the accompanying drawings, in which the preferable embodiments of the present disclosure have been illustrated. However, the present disclosure can be implemented in various manners, and thus should not be construed to be limited to the embodiments disclosed herein. To the contrary, those embodiments are provided for the thorough and complete understanding of the present disclosure, and completely conveying the scope of the present disclosure to those skilled in the art.

The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.

The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.

Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.

These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

Referring now to FIG. 1, a block diagram of an exemplary computer system/server 12 which is applicable to implement the embodiments of the present invention is illustrated. Computer system/server 12 illustrated in FIG. 1 is only illustrative and is not intended to suggest any limitation as to the scope of use or functionality of embodiments of the invention described herein.

As illustrated in FIG. 1, computer system/server 12 is illustrated in the form of a general-purpose computing device. The components of computer system/server 12 may include, but are not limited to, one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and processing units 16.

Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.

Computer system/server 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer system/server 12, and it includes both volatile and non-volatile media, removable and non-removable media.

System memory 28 can include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. Computer system/server 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, storage system 34 can be provided for reading from and writing to a non-removable, non-volatile magnetic media (not illustrated in FIG. 1 and typically called a “hard drive”). Although not illustrated in FIG. 1, a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (e.g., a “floppy disk”), and an optical disk drive for reading from or writing to a removable, non-volatile optical disk such as a CD-ROM, DVD-ROM or other optical media can be provided. In such instances, each drive can be connected to bus 18 by one or more data media interfaces. As will be further depicted and described below, memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the present invention.

Program/utility 40, having a set (at least one) of program modules 42, may be stored in memory 28 by way of example, and not limitation, as well as an operating system, one or more application programs, other program modules, and program data. Each of the operating system, one or more application programs, other program modules, and program data or some combination thereof, may include an implementation of a networking environment. Program modules 42 generally carry out the functions and/or methodologies of embodiments of the present invention as described herein.

Computer system/server 12 may also communicate with one or more external devices 14 such as a keyboard, a pointing device, a display 24, etc.; one or more devices that enable a user to interact with computer system/server 12; and/or any devices (e.g., network card, modem, etc.) that enable computer system/server 12 to communicate with one or more other computing devices. Such communication can occur via Input/Output (I/O) interfaces 22. Still yet, computer system/server 12 can communicate with one or more networks such as a local area network (LAN), a general wide area network (WAN), and/or a public network (e.g., the Internet) via network adapter 20. As depicted, network adapter 20 communicates with the other components of computer system/server 12 via bus 18. It should be understood that although not illustrated, other hardware and/or software components could be used in conjunction with computer system/server 12. Examples include, but are not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data archival storage systems, etc.

FIG. 2 schematically shows a sequence diagram 200 of a method for executing multiple jobs comprising parallel tasks according to one solution. FIG. 2 shows three jobs, i.e. a job 1 210, a job 2 220 and a job 3 230, wherein the horizontal axis represents the time axis. Note that in a distributed computing system, one job may comprise multiple tasks, at least one part of which may be executed in parallel. While a job is being launched, a task for scheduling multiple tasks and a task for initializing various network configurations in a distributed computing environment might be involved; and before the job ends, a task for releasing network resources might further be involved. In some embodiments, the word “jobs” refers to files including executable code to be executed on a computer system.

FIG. 2 shows a sequence diagram of executing tasks in a respective job. Specifically, LT represents loading time, i.e. time for loading a corresponding job; NIT represents network initializing time, i.e. time for initializing various network resources that might be involved while executing a corresponding job; NCT represents network clear-up time, i.e., time that is spent clearing up network resources occupied by a corresponding job; and RT represents run time, i.e. time actually spent running tasks associated with a computation object of a corresponding job.

As shown in FIG. 2, job 1 210, job 2 220 and job 3 230 are serially executed. In this case, while executing each job the following operations need to be performed in order: loading operation, network initialization, run operation, and network resource clear-up (for example, with respect to job 1 210, times occupied by these operations correspond to LT1 212, NIT1 214, RT1 216 and NCT1 218, respectively). With respect to job 2 220 and job 3 230, times occupied by the various operations are similar to the time allocation in job 1 210.

As seen from FIG. 2, while executing each job, only shaded time periods RT1 216, RT2 226 and RT3 236 are times actually used for executing tasks associated with a computation object of the job, whereas other time overheads (i.e. LT+NIT+NCT) take a large proportion of total time overheads for executing the respective jobs. Therefore, it is desired to increase the proportion of run time to total time overheads for executing jobs and further improve the efficiency of executing multiple jobs in a distributed computing environment.
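The proportion discussed above can be illustrated with a small worked example. The following is only a sketch: the timing values are hypothetical (the source gives no actual figures), and the dictionary layout and the function name `run_time_proportion` are assumptions made for illustration.

```python
# Hypothetical per-job timings in seconds; the source gives no actual figures.
jobs = [
    {"LT": 2.0, "NIT": 3.0, "RT": 10.0, "NCT": 1.0},  # job 1
    {"LT": 2.0, "NIT": 3.0, "RT": 8.0, "NCT": 1.0},   # job 2
    {"LT": 2.0, "NIT": 3.0, "RT": 12.0, "NCT": 1.0},  # job 3
]


def run_time_proportion(jobs):
    """Return total RT divided by total (LT + NIT + RT + NCT) for serial execution."""
    run = sum(j["RT"] for j in jobs)
    total = sum(j["LT"] + j["NIT"] + j["RT"] + j["NCT"] for j in jobs)
    return run / total


serial_proportion = run_time_proportion(jobs)
print(f"run-time proportion under serial execution: {serial_proportion:.3f}")
```

With these assumed values, 30 of the 48 seconds are run time, so the remaining 18 seconds are LT, NIT and NCT overhead that each job pays separately.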

In view of the above drawback in existing solutions, one embodiment of the present invention provides a technical solution for managing multiple jobs in a distributed computing system. Generally, the technical solution can allocate resources with respect to an entire workflow all at once, and the allocated resources allow all jobs in the workflow to be correctly executed according to a logical relationship of the workflow; when all jobs in the workflow end, the technical solution can release all resources at the same time.

Specifically, the method comprises: dividing, in response to having received multiple jobs, multiple tasks comprised in each of the multiple jobs into a configuration task and a computation task; combining configuration tasks associated with the multiple jobs into a super configuration task; merging the multiple jobs into a super job based on the super configuration task and the computation task; and executing the super configuration task and the computation task comprised in the super job by using multiple computing nodes in a distributed computing environment, wherein each of the multiple jobs is an executable program.

FIG. 3 schematically shows a flowchart 300 of a method for managing multiple jobs in a distributed computing system according to one embodiment of the present invention. The method starts from step S302, where in response to having received multiple jobs, multiple tasks comprised in each of the multiple jobs are divided into configuration tasks and computation tasks. In some embodiments, files containing code segments are parsed, and the code segments are identified and grouped according to their type as a configuration task or computational task based on definitions contained in a basic function library.

As described above, the configuration tasks may represent tasks for scheduling and managing respective tasks in a job, such as a task for launching a job and a task for scheduling resources involved in executing multiple tasks (which may comprise, e.g., resource initialization and resource release). The computation tasks may be tasks directly related to the accomplishment of the job's computation object. In this embodiment, multiple tasks comprised in a job may be divided into a configuration task and a computation task based on characteristics of code (e.g. binary code) of an application.
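The dividing step can be sketched as a lookup against library definitions. This is a minimal illustration under stated assumptions, not the method of the embodiment: the function names in `CONFIGURATION_FUNCTIONS`, the `(segment name, called function)` representation of a code segment, and `divide_tasks` are all hypothetical, and the sketch does not show how segments would be recovered from binary code.

```python
# Hypothetical names standing in for definitions in a basic function library.
CONFIGURATION_FUNCTIONS = {"lib_load", "lib_net_init", "lib_net_cleanup"}


def divide_tasks(segments):
    """Split (name, called_function) segments into configuration and computation groups."""
    configuration, computation = [], []
    for name, called_function in segments:
        if called_function in CONFIGURATION_FUNCTIONS:
            configuration.append(name)
        else:
            # Anything not defined as a configuration function is treated as computation.
            computation.append(name)
    return configuration, computation


# Hypothetical segments identified in one job.
job_segments = [
    ("load", "lib_load"),
    ("net_init", "lib_net_init"),
    ("solve_block_0", "user_kernel"),
    ("solve_block_1", "user_kernel"),
    ("net_cleanup", "lib_net_cleanup"),
]
config_tasks, compute_tasks = divide_tasks(job_segments)
print(config_tasks)
print(compute_tasks)
```

Treating every unrecognized segment as a computation task is a simplifying design choice for the sketch; a real classifier might need a third category for segments it cannot identify.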

Note that in this embodiment the configuration task resulting from the dividing may be one or more configuration tasks, e.g. may comprise tasks for initializing and releasing network resources; and the computation task resulting from the dividing may comprise parallel executable tasks comprised in a respective job. For example, with respect to job 1 210 in FIG. 2, tasks executed during time periods LT1 212, NIT1 214 and NCT1 218 may be configuration tasks, and tasks executed during time period RT1 216 may be computation tasks; the computation tasks may comprise multiple tasks executable in parallel, or tasks in a job may be serially executed. In job 2 220 and job 3 230, the types of tasks executed in the respective time periods may be similar to those in job 1 210.

Note that in the various embodiments of the present invention it is not required that all tasks in a job can be executed in parallel; it suffices that at least one part of the tasks may be executed in parallel. In the distributed computing environment, these parallel executable tasks may be dispatched to computing nodes in the distributed computing environment so as to be processed by using computing resources of the respective computing nodes.

Note in the context of the present invention, it is not intended to discuss how to dispatch multiple parallel executable tasks to multiple computing nodes or how to collect intermediate computation results obtained from the multiple computing nodes so as to form a final processing result. Those skilled in the art may implement the procedure based on principles and algorithms of distributed computing.

In step S304, configuration tasks associated with the multiple jobs are combined into a super configuration task. In existing technical solutions, when executing each of multiple jobs, it is necessary to execute a corresponding configuration task in the job. However, functions of configuration tasks of the multiple jobs are roughly identical, and it takes much time to execute the configuration task of each job separately. In this step, time overheads for executing configuration tasks can be reduced by extracting the same content from configuration tasks associated with the multiple jobs and combining these configuration tasks into a super configuration task.
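One simple way to picture the combining step is de-duplication of identical configuration operations. The sketch below assumes, for illustration only, that configuration tasks can be compared by name and that the first-seen order is a valid execution order; the function name `combine_configuration_tasks` and the operation names are hypothetical.

```python
def combine_configuration_tasks(per_job_config):
    """Merge per-job configuration task lists, keeping each distinct operation once.

    Operations keep the order in which they are first seen.
    """
    seen = set()
    super_config = []
    for tasks in per_job_config:
        for operation in tasks:
            if operation not in seen:
                seen.add(operation)
                super_config.append(operation)
    return super_config


# Hypothetical configuration tasks extracted from three jobs.
per_job_config = [
    ["load", "net_init", "net_cleanup"],
    ["load", "net_init", "net_cleanup"],
    ["load", "net_init", "net_cleanup"],
]
super_configuration = combine_configuration_tasks(per_job_config)
print(super_configuration)
```

Here nine configuration operations across the three jobs collapse into three, which is the source of the time saving described above.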

In step S306, the multiple jobs are merged into a super job based on the super configuration task and the computation task. Like a conventional job, the super job may comprise a configuration task and computation tasks. However, unlike the conventional job, the computation tasks in the super job do not come from a single job but are formed by combining computation tasks extracted from the multiple jobs; in addition, the configuration task in the super job is the super configuration task generated in step S304. Note that computation tasks from the multiple jobs may be serially arranged, directly in the order in which the multiple jobs are to be executed, to form the computation task of the super job.

Specifically, suppose job Job-A and job Job-B should be serially executed in a distributed computing system, and according to the dividing step shown in step S302 in FIG. 3, multiple computation tasks Task-A 1, . . . , Task-A N and multiple computation tasks Task-B 1, . . . , Task-B M have been extracted from job Job-A and job Job-B, respectively. At this point, the merged super job may be constructed such that the computation tasks Task-B 1, . . . , Task-B M are executed after executing computation tasks Task-A 1, . . . , Task-A N.
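The Job-A and Job-B example can be sketched as a small data structure. This is an illustrative assumption rather than the structure used by the embodiment: the `Job` and `SuperJob` classes, their field names, and `merge_jobs` are hypothetical, and the task counts (N = 2, M = 3) are chosen arbitrarily.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    configuration: list
    computation: list


@dataclass
class SuperJob:
    configuration: list
    # One group per original job, kept in the jobs' execution order.
    computation_groups: list = field(default_factory=list)


def merge_jobs(jobs, super_configuration):
    """Build a super job whose computation groups follow the jobs' execution order."""
    return SuperJob(
        configuration=list(super_configuration),
        computation_groups=[list(job.computation) for job in jobs],
    )


config = ["load", "net_init", "net_cleanup"]
job_a = Job("Job-A", config, ["Task-A 1", "Task-A 2"])
job_b = Job("Job-B", config, ["Task-B 1", "Task-B 2", "Task-B 3"])
super_job = merge_jobs([job_a, job_b], config)
print(super_job.computation_groups)
```

Keeping one group per original job preserves the constraint that the Task-B group starts only after the Task-A group has finished.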

In step S308, the super configuration task and the computation task comprised in the super job are executed by using multiple computing nodes in a distributed computing environment, wherein each of the multiple jobs is an executable program. In this embodiment, the super job may be executed using multiple computing nodes in a distributed computing environment in a manner similar to executing a conventional job.

Note that the method according to the various embodiments of the present invention can conduct performance optimization while jobs are running, so as to improve the execution efficiency of each job. Since source code of each job cannot be obtained at runtime but only executable programs can be obtained, the optimization during running is conducted based on executable programs and does not involve programmers optimizing source code during the development phase.

Specifically, since the super job may comprise parallel executable tasks, these parallel executable tasks may be executed by using multiple computing nodes. For example, continuing the example shown in step S306, when the super job comprises computation tasks Task-A 1, . . . , Task-A N and computation tasks Task-B 1, . . . , Task-B M, first computation tasks Task-A 1, . . . , Task-A N are executed at least partially in parallel by using multiple computing nodes, and then computation tasks Task-B 1, . . . , Task-B M are executed at least partially in parallel by using the multiple computing nodes.

At this point, although the two groups of computation tasks may be executed sequentially with respect to each other, at least one part of all the computation tasks comprised in the super job is executed in parallel.
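The execution order described above (parallel within a group, sequential between groups) can be sketched with a thread pool. This is only an analogy: worker threads on one machine stand in for computing nodes, and `run_task`, `execute_groups` and the task names are hypothetical placeholders rather than part of any basic function library.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task(name):
    """Stand-in for a computation task; a real task would do work on a computing node."""
    return f"{name} done"


def execute_groups(groups, workers=4):
    """Run each group's tasks in parallel; start a group only after the previous one ends."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for group in groups:
            # Consuming the map iterator with list() waits for the whole group,
            # so the next group cannot start early. Results keep input order.
            results.append(list(pool.map(run_task, group)))
    return results


groups = [
    ["Task-A 1", "Task-A 2"],
    ["Task-B 1", "Task-B 2", "Task-B 3"],
]
results = execute_groups(groups)
print(results)
```

The pool is created once and reused for both groups, which loosely mirrors the idea that resources set up by the super configuration task serve all computation tasks of the super job.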

In one embodiment of the present invention, the super configuration task is executed only once. Note in distributed computing technology, configuration tasks associated with task management and scheduling may be the same; since multiple configuration tasks have been combined into a super configuration task in step S304, in this embodiment of the present invention the super configuration task only needs to be executed once.

With respect to the example in FIG. 2, configuration tasks executed in time periods LT1 212, LT2 222 and LT3 232 may be the same, e.g. for executing a loading operation. Similarly, configuration tasks executed in time periods NIT1 214, NIT2 224 and NIT3 234 may be identical tasks, e.g. for executing a network initializing operation. In one embodiment of the present invention, these same configuration tasks are combined into a super configuration task and are only executed once during running. Illustration is presented below with respect to FIG. 4.

FIG. 4 schematically shows a sequence diagram 400 of a method for executing a super task according to one embodiment of the present invention. This figure shows a sequence diagram of a super task that is formed after performing the method according to the present invention with respect to the multiple jobs shown in FIG. 2. As shown in FIG. 2, the corresponding loading operation, network initializing operation, running operation and network clear-up operation all need to be performed while executing each job, whereas only the running operation is used for executing tasks directly associated with the job's computation object and the other operations are auxiliary operations.

Using the method of the present invention, the three jobs shown in FIG. 2 may be merged into a super job, and tasks in the super job are scheduled and managed once again. Specifically, as shown in FIG. 4, loading times LT1 212, LT2 222 and LT3 232 of the three jobs shown in FIG. 2 are merged into a loading time LT 412 of a super job 410. Similarly, network initializing times of the three jobs in FIG. 2 are merged into a network initializing time NIT 414 of super job 410, and network clear-up times of the three jobs in FIG. 2 are merged into a network clear-up time NCT 418 of super job 410. Further, the run times (i.e. RT1+RT2+RT3) of the three jobs in FIG. 2 are used as the run time of super job 410 as a whole. As seen from FIG. 4, the proportion of the run time of the super job to the total execution time is greatly increased, so the operation efficiency of the distributed computing system is improved.
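The effect on the run-time proportion can be made concrete with a small Python calculation. All durations below are hypothetical numbers chosen for illustration, and the sketch assumes that each merged auxiliary time (LT, NIT, NCT) of the super job is as long as the longest corresponding time among the individual jobs; the embodiments do not specify these durations.

```python
def run_time_fraction(run_time, overhead):
    """Fraction of the total execution time spent in the running operation."""
    return run_time / (run_time + overhead)


# Hypothetical durations (arbitrary time units) for the three jobs of FIG. 2.
jobs = [
    {"LT": 2.0, "NIT": 3.0, "RT": 5.0, "NCT": 1.0},
    {"LT": 2.0, "NIT": 3.0, "RT": 5.0, "NCT": 1.0},
    {"LT": 2.0, "NIT": 3.0, "RT": 5.0, "NCT": 1.0},
]
OVERHEAD_KEYS = ("LT", "NIT", "NCT")

# The run time of the super job is RT1 + RT2 + RT3.
total_run = sum(job["RT"] for job in jobs)

# Executed separately, every job pays its own auxiliary operations.
separate_overhead = sum(job[key] for job in jobs for key in OVERHEAD_KEYS)

# Assumption: merged, each auxiliary operation is paid once (longest instance).
merged_overhead = sum(max(job[key] for job in jobs) for key in OVERHEAD_KEYS)

separate_fraction = run_time_fraction(total_run, separate_overhead)
merged_fraction = run_time_fraction(total_run, merged_overhead)
```

With these assumed numbers the run-time proportion rises from 15/33 when the jobs run separately to 15/21 for the super job.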

In one embodiment of the present invention, the executable program is written based on a distributed computing basic function library. Since users of an executable program usually do not have the source code of the executable program, they cannot optimize the operation efficiency of each executable program. Note that when an executable program is written based on a distributed computing basic function library, the portions associated with a configuration task and a computation task may be extracted from the code (e.g. binary code) of the executable program.

FIG. 5 schematically shows a mapping relationship 500 between source code 510 and executable code 520 associated with a job. Those skilled in the art should understand an executable program is formed by compiling source code. Specifically, FIG. 5 on the left shows source code 510 associated with an executable program 520. When writing source code, functions in a distributed computing basic function library may be invoked. For example, MPI_Init( ) in the basic function library may be invoked to implement functionality associated with initialization, and MPI_Finalize( ) may be invoked to implement functionality associated with resource clear-up. Those skilled in the art should understand the functions MPI_Init( ) and MPI_Finalize( ) are illustrative only, and in the basic function library there may exist other functions associated with initialization and resource clear-up.

Executable program 520 on the right of FIG. 5 is an executable program resulting from compiling source code 510. Executable program 520 may take the form of binary code, for example, and the binary code may comprise code segments corresponding to respective functions in the source code. For example, a code segment 1 522 may correspond to MPI_Init( ), a code segment 2 524 may correspond to Compute( ), and a code segment 3 526 may correspond to MPI_Finalize( ). Therefore, the concrete meaning of an executable program may be analyzed by parsing code segments of the executable program. One embodiment of the present invention divides multiple tasks in a job into a configuration task and a computation task based on this principle.

Specifically, in one embodiment of the present invention, the dividing, in response to having received multiple jobs, multiple tasks comprised in each of the multiple jobs into a configuration task and a computation task comprises: with respect to a current job among the multiple jobs, according to the definitions of the distributed computing basic function library, extracting from multiple tasks in the current job at least one of the following as a configuration task: a scheduling task and a network resource management task, the scheduling task being used for launching the current job, the network resource management task being used for managing network resources needed for executing the current job; and taking tasks other than the configuration task among the multiple tasks in the current job as the computation task of the job.

The content of each function has been explicitly defined in the distributed computing basic function library, and the content of the code segment associated with each function is also known. Therefore, according to the definitions of the distributed computing basic function library, at least one of the following may be extracted from multiple tasks in the current job as a configuration task: a scheduling task and a network resource management task.

With respect to the concrete example shown in FIG. 5, when it is defined in the basic function library that MPI_Init( ) and MPI_Finalize( ) are tasks for allocating and releasing network resources, it may be known that code segment 1 522 and code segment 3 526 in executable program 520 are network resource management tasks. Similarly, those skilled in the art may determine which code segments belong to scheduling tasks based on the definitions of functions associated with scheduling tasks in the basic function library. Note that MPI_Init( ) and MPI_Finalize( ) are illustrative only, and they show operations before entering and after exiting a user program. With respect to a basic function library provided by a different vendor, one or more other functions may represent operations of allocating and releasing resources.

After determining the configuration task, tasks other than the configuration task among the multiple tasks in the current job may be used as the computation task of the job. With respect to the concrete example in FIG. 5, it may be determined that code segment 2 524 corresponding to Compute( ) belongs to the computation task. Note that although in FIG. 5 the computation task is represented by Compute( ) only, those skilled in the art may understand that the computation task refers to multiple tasks that are executable at least partially in parallel by various computing nodes in the distributed computing system.
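The division described with respect to FIG. 5 may be sketched as follows in Python. This is a simplified illustration, not the parsing mechanism of the embodiments: it assumes each code segment has already been resolved to the name of the library function it corresponds to, and the `CONFIG_FUNCTIONS` table and `divide_tasks` function are hypothetical names standing in for the definitions of the basic function library.

```python
# Hypothetical stand-in for the definitions of the basic function library:
# function name -> kind of configuration task it represents.
CONFIG_FUNCTIONS = {
    "MPI_Init": "network initialization",
    "MPI_Finalize": "network clear-up",
}


def divide_tasks(segments):
    """Split code segments into (configuration tasks, computation tasks).

    A segment whose function is defined in the library as a configuration
    function is a configuration task; every other segment is treated as a
    computation task.
    """
    config_tasks, compute_tasks = [], []
    for segment in segments:
        if segment in CONFIG_FUNCTIONS:
            config_tasks.append(segment)
        else:
            compute_tasks.append(segment)
    return config_tasks, compute_tasks


# Code segments 1 522, 2 524 and 3 526 of executable program 520 in FIG. 5.
config_tasks, compute_tasks = divide_tasks(["MPI_Init", "Compute", "MPI_Finalize"])
```

Here the segments corresponding to MPI_Init( ) and MPI_Finalize( ) end up in the configuration group and the segment corresponding to Compute( ) ends up in the computation group.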

With reference to FIG. 6, a description is presented below of how to execute multiple tasks in parallel. FIG. 6 schematically shows a schematic view 600 of principles for executing a method comprising parallel tasks according to one embodiment of the present invention. For example, a job 610 may comprise multiple tasks, i.e. a task 1 612, a task 2 614, . . . , a task N 616, among which at least one part may be executed in parallel.

According to one principle of distributed computing, while executing a job, each task may have a specific network address. During the initial phase of executing the job, each task may send its network address to a specific device so as to build a network address table comprising network addresses of respective tasks (or the specific device proactively collects network addresses of respective tasks to build a network address table, etc.). Subsequently, multiple tasks may communicate with each other via addresses in the network address table, so as to accomplish the job. For example, multiple tasks 1 612, 2 614, . . . , N 616 shown in FIG. 6 communicate with one another via a network address table 620.

In one embodiment of the present invention, the network resource management task at least comprises a network initialization task and a network clear-up task.

In one embodiment of the present invention, the network initialization task at least comprises: collecting network addresses of various computation tasks so as to form a network address table for communication between the various computation tasks. Those skilled in the art should understand in this embodiment the network initialization task may comprise, for example, operations of building network address table 620 with respect to tasks 1 612 . . . N 616 shown in FIG. 6.

In one embodiment of the present invention, the network clear-up task at least comprises: clearing up the network address table. Note since the network address table is associated with specific tasks in a job, when all tasks in the job have been executed, the network address table becomes useless and thus needs to be removed. In this embodiment, the network clear-up task may comprise, for example, operations of removing the network address table.

Note although the network initialization task and the network clear-up task have been illustrated above in the context of building and removing a network address table respectively, the two tasks may further comprise other operations. For example, the network initialization task may further comprise setting network configurations of various computing nodes in the distributed computing system, etc.
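The building and removal of a network address table may be sketched in Python as below. The class and function names (`NetworkAddressTable`, `network_initialization`, `network_clear_up`) and the address strings are hypothetical; the sketch models the table as an in-memory mapping and does not touch any real network.

```python
class NetworkAddressTable:
    """Hypothetical in-memory table mapping task identifiers to addresses."""

    def __init__(self):
        self._addresses = {}

    def register(self, task_id, address):
        self._addresses[task_id] = address

    def lookup(self, task_id):
        return self._addresses[task_id]

    def clear(self):
        self._addresses.clear()

    def __len__(self):
        return len(self._addresses)


def network_initialization(task_addresses):
    """Collect the address reported by every computation task into one table."""
    table = NetworkAddressTable()
    for task_id, address in task_addresses.items():
        table.register(task_id, address)
    return table


def network_clear_up(table):
    """Remove the table once all tasks in the job have been executed."""
    table.clear()


table = network_initialization({"task1": "addr-1", "task2": "addr-2"})
size_after_init = len(table)
address_of_task2 = table.lookup("task2")
network_clear_up(table)
size_after_clear_up = len(table)
```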

In one embodiment of the present invention, the executing the super configuration task and the computation tasks comprised in the super job by using multiple computing nodes in a distributed computing environment comprises: executing the super configuration task serially with the computation tasks; and executing the computation tasks at least partially in parallel.

On the one hand, the super configuration task functions to set an appropriate environment for the computation tasks and to reset the computing environment upon completion of the computation tasks. Therefore, the super configuration task may be executed serially with the computation tasks. For example, network resources of the distributed computing system may be initialized before executing the computation tasks and then released after executing the computation tasks. On the other hand, the computation tasks may comprise multiple parallel executable tasks. Therefore, these computation tasks may be executed at least partially in parallel by multiple computing nodes according to general principles of distributed computing technology.
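The ordering described above, configuration executed serially around computation tasks that run in parallel, may be sketched in Python as follows. As an assumption made purely for illustration, worker threads from the standard `concurrent.futures` module stand in for computing nodes, and `run_super_job`, `configure` and `release` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_super_job(configure, compute_tasks, release, max_workers=4):
    """Run configuration serially around computation tasks run in parallel."""
    configure()  # e.g. initialize network resources, executed once
    try:
        # Worker threads are a stand-in for computing nodes in this sketch.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # map() returns results in the order the tasks were submitted.
            return list(pool.map(lambda task: task(), compute_tasks))
    finally:
        release()  # e.g. release network resources, executed once


events = []
squares = run_super_job(
    configure=lambda: events.append("configure"),
    compute_tasks=[lambda n=n: n * n for n in range(4)],
    release=lambda: events.append("release"),
)
```

The `try`/`finally` arrangement is a design choice of the sketch: it releases resources even when a computation task fails, which the embodiments do not address.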

In one embodiment of the present invention, the executing the super configuration task serially with the computation tasks comprises: serially executing the scheduling task, the network initialization task, the computation tasks, and the network clear-up task. Detailed description is presented below with reference to FIG. 7.

FIG. 7 schematically shows a flowchart 700 of a method for executing a super task according to one embodiment of the present invention. First, in step S702 a scheduling task is executed to launch a current job. In step S704 it is determined whether or not there exists a network address table in a distributed computing system. If “No,” then in step S706 network addresses of multiple computation tasks in the super task are retrieved so as to form the network address table. Then the operational flow proceeds to step S708. If “Yes,” then the operational flow proceeds to step S708.

In step S708 communication is conducted between the computation tasks by using the network address table, so as to execute the computation tasks. When one of the computation tasks is completed, in step S710 it is determined whether or not a next computation task exists: if yes, then the operational flow returns to step S708. If all the computation tasks are completed, then the operational flow proceeds to step S712 to release network resources.

Note in the method shown in FIG. 7 steps S704 to S706 belong to the network initialization task, and steps S708 to S710 belong to the computation task. Note operations shown in steps S708 to S710 are not performed at a single computing node but may be executed at multiple computing nodes in the distributed computing system. For example, an idle computing node may query whether or not there exists a next computation task; if finding there is a pending computation task A, the idle node may execute computation task A.
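The control flow of FIG. 7 may be traced with the Python sketch below. It is a single-process simplification: computation tasks are run one after another, whereas in the embodiments steps S708 to S710 may be carried out by multiple computing nodes. The function name `execute_super_task`, the generated `addr-N` addresses, and the representation of tasks as callables receiving the table are all assumptions of the sketch.

```python
from collections import deque


def execute_super_task(compute_tasks, address_table=None):
    """Follow steps S702 to S712 and record which steps were taken.

    compute_tasks maps a task name to a callable that receives the
    network address table and returns the task's result.
    """
    trace = ["S702"]  # scheduling task launches the job
    trace.append("S704")  # does a network address table already exist?
    if address_table is None:
        # S706: form the table from the computation tasks (addresses assumed).
        address_table = {
            name: "addr-%d" % index for index, name in enumerate(compute_tasks)
        }
        trace.append("S706")

    pending = deque(compute_tasks)
    results = {}
    while pending:
        name = pending.popleft()
        trace.append("S708")  # execute a task using the address table
        results[name] = compute_tasks[name](address_table)
        trace.append("S710")  # check whether a next computation task exists

    address_table.clear()  # S712: release network resources
    trace.append("S712")
    return results, trace


tasks = {
    "A": lambda table: table["A"],
    "B": lambda table: table["B"],
}
results, trace = execute_super_task(tasks)
```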

In one embodiment of the present invention, the executing the computation tasks at least partially in parallel comprises: communicating between the computation tasks by using the network address table, so as to execute the computation tasks. The network address table may record a network address uniquely identifying each computation task in the job. For example, a network address of computation task A may be represented as “AAAA,” while a network address of a computation task B may be represented as “BBBB.” When computation task A needs to send a message to computation task B, it directly transmits data packets to the network address “BBBB.” The computation tasks deliver messages via network addresses in the network address table; as messages are delivered between corresponding computation tasks, the respective computation tasks are accomplished.
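Message delivery via the network address table may be illustrated with the short Python sketch below. The `mailboxes` structure and the `send` function are hypothetical: delivery is simulated by appending to an in-memory list keyed by address, in place of transmitting data packets over a real network.

```python
from collections import defaultdict

# Network address table from the example above: task -> network address.
address_table = {"A": "AAAA", "B": "BBBB"}

# Hypothetical stand-in for the network: address -> list of delivered packets.
mailboxes = defaultdict(list)


def send(sender, receiver, payload):
    """Look up the receiver's address in the table and deliver the packet there."""
    destination = address_table[receiver]
    mailboxes[destination].append((address_table[sender], payload))


# Computation task A sends a message to computation task B.
send("A", "B", "partial result")
```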

FIG. 8 schematically shows a block diagram 800 of an apparatus for managing multiple jobs in a distributed computing system according to one embodiment of the present invention. Specifically, there is an apparatus provided for managing multiple jobs in a distributed computing system, the apparatus comprising: a dividing module 810 configured to divide, in response to having received multiple jobs, multiple tasks comprised in each of the multiple jobs into configuration tasks and computation tasks; a combining module 820 configured to combine the configuration tasks associated with the multiple jobs into a super configuration task; a merging module 830 configured to merge the multiple jobs into a super job based on the super configuration task and the computation tasks; and an executing module 840 configured to execute the super configuration task and the computation tasks comprised in the super job by using multiple computing nodes in a distributed computing environment, wherein each of the multiple jobs is an executable program.

In one embodiment of the present invention, the super configuration task is executed only once.

In one embodiment of the present invention, the executable program is written based on a distributed computing basic function library.

In one embodiment of the present invention, dividing module 810 comprises: an extracting module configured to, with respect to a current job among the multiple jobs, according to the definitions of the distributed computing basic function library, extract at least one of the following as a configuration task from multiple tasks in the current job: a scheduling task and a network resource management task, the scheduling task being used for launching the current job, the network resource management task being used for managing network resources needed for executing the current job; and a specifying module configured to take tasks other than the configuration task among the multiple tasks in the current job as the computation task of the job.

In one embodiment of the present invention, the network resource management task at least comprises a network initialization task and a network clear-up task.

In one embodiment of the present invention, executing module 840 comprises: a serial executing module configured to execute the super configuration task serially with the computation tasks; and a parallel executing module configured to execute the computation tasks at least partially in parallel.

In one embodiment of the present invention, the serial executing module comprises: a first serial module configured to serially execute the scheduling task, the network initialization task, the computation tasks, and the network clear-up task.

In one embodiment of the present invention, the network initialization task at least comprises: collecting network addresses of various computation tasks so as to form a network address table for communication between the various computation tasks.

In one embodiment of the present invention, the network clear-up task at least comprises: clearing up the network address table.

In one embodiment of the present invention, the parallel executing module comprises: a communicating module configured to communicate between the computation tasks by using the network address table, so as to execute the computation tasks.

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks illustrated in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

What is claimed is:
 1. A method comprising: receiving, in a distributed computing environment, a plurality of files for execution; identifying, by parsing the plurality of files, code segments contained in each of the plurality of files; determining, based on a comparison of the code segments and definitions contained in a distributed computing basic function library, a first group of the code segments that include configuration tasks and a second group of the code segments that include computational tasks; combining the first group of the code segments to form a super configuration task; creating an executable code, wherein the executable code comprises the super configuration task and the second group of the code segments; allocating the executable code to one or more nodes of the distributed computing environment; and executing the executable code on the one or more nodes of the distributed computing environment.
 2. The method of claim 1, wherein the super configuration task is only executed once.
 3. The method of claim 1, wherein the plurality of files are written based on the distributed computing basic function library.
 4. The method of claim 1, wherein the determining the first group of the code segments and the second group of the code segments comprises: identifying, from each of the plurality of files based on the definitions contained in the distributed computing basic function library, scheduling tasks and network resource management tasks as configuration tasks; and identifying tasks that are not configuration tasks as being computational tasks.
 5. The method of claim 4, wherein the scheduling tasks are tasks used for launching one or more of the computational tasks and the network resource management tasks are tasks used for managing network resources needed for executing one or more of the computational tasks.
 6. The method of claim 5, wherein the network resource management tasks include network initialization tasks and network clear-up tasks.
 7. The method of claim 6, wherein the network initialization tasks include collecting network addresses for the computational tasks to form a network address table.
 8. The method of claim 7, wherein the network clear-up tasks include clearing the network address table.
 9. The method of claim 1, wherein the executing the executable code on one or more nodes of the distributed computing environment comprises: executing the super configuration task and one or more of the computational tasks in serial; and executing one or more of the computational tasks at least partially in parallel.